
Conversation

@jordimas
Collaborator

@jordimas jordimas commented Nov 2, 2025

Changes:

  • Update Windows image to 2022 (2019 is deprecated and no longer available)
  • Update from Python 3.9 (which reached end of life on 2025-10-31) to Python 3.10
  • Update Windows and Linux from CUDA 12.2 to CUDA 12.4
  • Update oneAPI to 2024.0 (the previous SDK download was returning 404)
  • Fixes to compile on macOS

@jordimas jordimas changed the title WIP: Fix build and update to CUDA 12.4 Fix build and update to CUDA 12.4 Nov 2, 2025
@jordimas jordimas merged commit ac8b6e7 into OpenNMT:master Nov 3, 2025
14 checks passed
@jordimas jordimas deleted the fix_ci_cuda_12_4 branch November 4, 2025 20:39
@3manifold

3manifold commented Nov 5, 2025

@jordimas PR #1905 was solving the same issue. I also explained all the error details there. Given that you're the new admin, it is suspicious that you ignored that PR (the only green CI for months) and decided to create a duplicate PR. 😡 You did not even leave a comment or mention that PR 👎

@3manifold

You even "fixed" the CMake warnings, correct? 😠 There is also a ticket and a PR for that as well: #1899 #1907

@ozancaglayan
Contributor

How does this interact with the case where one loads CTranslate2 in a Docker image whose installed CUDA runtime is >= 12.4? Why do we still compile against 12.4, and how does the CUDA_DYNAMIC_LOADING CMake flag interact with the toolkit version the package is compiled against?

@jordimas
Collaborator Author

Hello. None of the changes I made impacts the Docker container; it still compiles and runs with CUDA 12.2 and works fine. If you have observed any problem, please share the details.

Here is a new Dockerfile that supports CUDA 12.4:
#1932

I would appreciate it if you could test it.

@3manifold

@ozancaglayan probably means that python/tools/prepare_build_environment_linux.sh has to be kept in sync with docker/Dockerfile, since their instructions overlap. Ideally, some functions in python/tools/prepare_build_environment_linux.sh could be refactored and reused in docker/Dockerfile.
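One way to reduce the drift described above is to keep the pinned versions in a single place that both the CI script and the Dockerfile consume. A minimal sketch (the file name, the `render_env_file` helper, and all version pins here are hypothetical, not taken from the repository):

```python
# Hypothetical sketch: keep build pins in one file (e.g. versions.env) that
# both python/tools/prepare_build_environment_linux.sh and docker/Dockerfile
# could read, so the two cannot drift apart. Pins are illustrative only.

PINS = {
    "CUDA_VERSION": "12.4",
    "CUDNN_VERSION": "9",
    "ONEAPI_VERSION": "2024.0",
    "PYTHON_VERSION": "3.10",
}

def render_env_file(pins: dict) -> str:
    """Render pins as sorted KEY=VALUE lines, usable via `source versions.env`
    in a shell script or as `--build-arg` values for `docker build`."""
    return "\n".join(f"{key}={value}" for key, value in sorted(pins.items())) + "\n"

if __name__ == "__main__":
    print(render_env_file(PINS), end="")
```

The shell script would then `source versions.env`, and the Dockerfile would take the same values as build arguments, so bumping CUDA happens in exactly one place.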

@ozancaglayan
Contributor

ozancaglayan commented Nov 13, 2025

Sorry for being unclear:

  • I build my own Docker container for CTranslate2 & Faster Whisper
  • I've been using CUDA 12.8 runtime dependencies there through apt-get without any issues
  • I'm not building CTranslate2 from source there, just installing it through pip

But in this repo, related to this PR, I see that there's an explicit compilation stage which used CUDA 12.2 and is now updated to CUDA 12.4.

Probably all minor releases of CUDA 12 are compatible and one can switch to a newer one at runtime, right?

If we were to try CUDA-13 for example, what would be the steps for that?
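The minor-version compatibility assumption in the question above matches NVIDIA's documented rule: within one major release (CUDA 12.x), a binary compiled against one minor version is expected to load with another, while moving to CUDA 13 requires a rebuild. A sketch of that rule (illustrative version strings, not a statement about what CTranslate2 actually checks):

```python
# Sketch of CUDA "minor version compatibility": a wheel compiled against
# CUDA 12.4 is expected to work with any 12.x runtime, while CUDA 13
# would require recompilation. Pure illustration; edge cases (features
# introduced in a newer minor release) are not modeled here.

def is_runtime_compatible(compiled_against: str, runtime: str) -> bool:
    """Return True if a binary built against `compiled_against` is expected
    to load with `runtime`, per CUDA minor-version compatibility."""
    compiled_major = int(compiled_against.split(".")[0])
    runtime_major = int(runtime.split(".")[0])
    return compiled_major == runtime_major

assert is_runtime_compatible("12.4", "12.8")      # same major: expected OK
assert not is_runtime_compatible("12.4", "13.0")  # new major: rebuild needed
```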

@Purfview

What about this issue [int8 doesn't work on 50xx GPUs]: #1865

@jordimas
Collaborator Author

Hello @ozancaglayan

I am not familiar with the CUDA-18 runtime; to my knowledge the latest version is 13.
Since CTranslate2 is compiled with CUDA 12.4, you need a driver that is compatible with CUDA 12.
I cannot speak to your specific combination, but my suggestion is to try it, and if it does not work, please open an issue: https://github.com/OpenNMT/CTranslate2/issues. Thanks.
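The driver requirement mentioned above can be sketched as a lookup from CUDA major version to minimum driver branch. The driver numbers below are approximate values taken from NVIDIA's CUDA release notes for Linux (verify for your platform and forward-compatibility setup):

```python
# Rough sketch: minimum Linux driver branch needed per CUDA major release,
# per NVIDIA's CUDA release notes (approximate; verify for your platform).
MIN_DRIVER_FOR_CUDA_MAJOR = {12: 525, 13: 580}

def driver_supports(driver_major: int, cuda_major: int) -> bool:
    """True if a driver from branch `driver_major` can run binaries built
    against CUDA `cuda_major` (ignoring forward-compatibility packages)."""
    minimum = MIN_DRIVER_FOR_CUDA_MAJOR.get(cuda_major)
    return minimum is not None and driver_major >= minimum
```

So a driver from the 535 branch runs CUDA 12.x wheels, but moving this repo's build to CUDA 13 would also raise the minimum driver users need.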

@ozancaglayan
Contributor

Sorry I meant CUDA 12.8 🤦

@BBC-Esq

BBC-Esq commented Nov 13, 2025

When it comes to CUDA compatibility, review the following for compatibility issues/conflicts. I'm also including compatibility with other well-known libraries like Flash Attention 2, Triton, etc., since most programs typically use all of the above. Any way to make ctranslate2 more compatible?

NOTE: I only cover Windows since that's all I can test, but it shouldn't be too hard to pull similar info for Linux users:

****************************
Torch and CUDA Compatibility
****************************
+-------+-------+--------+----------+
| Torch | Wheel | CUDA   | cuDNN    |
+-------+-------+--------+----------+
| 2.9.0 | cu130 | 13.0.0 | 9.13.0.50|
|       | cu128 | 12.8.1 | 9.10.2.21|
|       | cu126 | 12.6.3 | 9.10.2.21|
+-------+-------+--------+----------+
| 2.8.0 | cu129 | 12.9.1 | 9.10.2.21|
|       | cu128 | 12.8.1 | 9.10.2.21|
|       | cu126 | 12.6.3 | 9.10.2.21|
+-------+-------+--------+----------+
| 2.7.1 | cu128 | 12.8.0 | 9.5.1.17 |
|       | cu126 | 12.6.3 | 9.5.1.17 |
+-------+-------+--------+----------+
| 2.7.0 | cu128 | 12.8.0 | 9.5.1.17 |
|       | cu126 | 12.6.3 | 9.5.1.17 |
+-------+-------+--------+----------+
| 2.6.0 | cu126 | 12.6.3 | 9.5.1.17 |
|       | cu124 | 12.4.1 | 9.1.0.70 |
+-------+-------+--------+----------+
# Python: 2.6-2.8 (>=3.9, <=3.13), 2.9 (>=3.10, <=3.14)
# https://github.com/pytorch/pytorch/blob/main/.github/scripts/generate_binary_build_matrix.py
# https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix
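For scripting compatibility checks against a matrix like the one above, the rows can be encoded as data. A sketch using a few rows transcribed from the table (treat it as a snapshot, not an authoritative source):

```python
# A few rows from the Torch/CUDA table above, encoded for programmatic
# lookups. Data transcribed from the table; treat as a snapshot.

TORCH_CUDA_WHEELS = {
    "2.9.0": {"cu130": "13.0.0", "cu128": "12.8.1", "cu126": "12.6.3"},
    "2.8.0": {"cu129": "12.9.1", "cu128": "12.8.1", "cu126": "12.6.3"},
    "2.6.0": {"cu126": "12.6.3", "cu124": "12.4.1"},
}

def cuda_for_wheel(torch_version, wheel_tag):
    """Return the CUDA toolkit version a given torch wheel was built with,
    or None if the combination is not in the table."""
    return TORCH_CUDA_WHEELS.get(torch_version, {}).get(wheel_tag)

assert cuda_for_wheel("2.6.0", "cu124") == "12.4.1"
assert cuda_for_wheel("2.9.0", "cu124") is None  # no cu124 wheel for 2.9.0
```

A check like this could flag, for example, that a CTranslate2 build pinned to CUDA 12.4 lines up with the torch 2.6.0 cu124 wheel but not with newer wheel tags.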


*****************************************
Metapackage Versions Within CUDA Releases
*****************************************
+----------+------------+--------------+-----------+-----------+-----------+
| CUDA Ver | cuda-nvrtc | cuda-runtime | cuda-nvcc |  cublas   |   cufft   |
+----------+------------+--------------+-----------+-----------+-----------+
| 12.6.3   | 12.6.77    | 12.6.77      | 12.6.77   | 12.6.4.1  | 11.3.0.4  |
| 12.8.0   | 12.8.61    | 12.8.57      | 12.8.57   | 12.8.3.14 | 11.3.3.41 |
| 12.8.1   | 12.8.93    | 12.8.90      | 12.8.93   | 12.8.4.1  | 11.3.3.83 |
| 12.9.1   | 12.9.86    | 12.9.79      | 12.9.79   | 12.9.1.4  | 11.4.1.4  |
| 13.0.0   | 13.0.48    | 13.0.48      | 13.0.48   | 13.0.0.19 | 12.0.0.15 |
| 13.0.1   | 13.0.88    | 13.0.88      | 13.0.88   | 13.0.2.14 | 12.0.0.61 |
| 13.0.2   | 13.0.88    | 13.0.96      | 13.0.88   | 13.1.0.3  | 12.0.0.61 |
+----------+------------+--------------+-----------+-----------+-----------+
# https://docs.nvidia.com/cuda/archive/12.6.3/cuda-toolkit-release-notes/index.html
# https://developer.download.nvidia.com/compute/cuda/redist/


****************************************
Torch Compatibility with Python & Triton
****************************************
+-------+------------------------+--------+----------+
| Torch | CUDA Versions          | Triton | Sympy    |
+-------+------------------------+--------+----------+
| 2.9.0 | 12.6, 12.8, 12.9, 13.0 | 3.5.0  | >=1.13.3 |
| 2.8.0 | 12.6, 12.8, 12.9       | 3.4.0  | >=1.13.3 |
| 2.7.1 | 12.6, 12.8             | 3.3.1  | >=1.13.3 |
| 2.7.0 | 12.6, 12.8             | 3.3.0  | >=1.13.3 |
| 2.6.0 | 12.4, 12.6             | 3.2.0  | 1.13.1   |
+-------+------------------------+--------+----------+
* Triton 3.1.0 and later wheels: https://github.com/woct0rdho/triton-windows/releases (supports Python 3.12)
* Since triton-windows==3.2.0.post11, windows wheels are published to https://pypi.org/project/triton-windows/
# The METADATA file for each torch wheel shows its compatibility with Python, Triton, and Sympy


*************************
WINDOWS Flash Attention 2
*************************
+-----------------+--------+---------+--------+
| Flash Attention | Python | PyTorch | CUDA   |
+-----------------+--------+---------+--------+
| 2.8.2 & 2.8.3   | 3.10   | 2.6.0   | 12.4.1 |
|                 | 3.11   | 2.6.0   | 12.4.1 |
|                 | 3.12   | 2.6.0   | 12.4.1 |
|                 | 3.13   | 2.6.0   | 12.4.1 |
+-----------------+--------+---------+--------+
| 2.8.2 & 2.8.3   | 3.10   | 2.7.0   | 12.8.1 |
|                 | 3.11   | 2.7.0   | 12.8.1 |
|                 | 3.12   | 2.7.0   | 12.8.1 |
|                 | 3.13   | 2.7.0   | 12.8.1 |
+-----------------+--------+---------+--------+
| 2.8.2 & 2.8.3   | 3.10   | 2.8.0   | 12.8.1 |
|                 | 3.11   | 2.8.0   | 12.8.1 |
|                 | 3.12   | 2.8.0   | 12.8.1 |
|                 | 3.13   | 2.8.0   | 12.8.1 |
+-----------------+--------+---------+--------+
# https://github.com/kingbri1/flash-attention


********
Xformers
********
+------------------+-------+---------------+--------------------------------+
| Xformers Version | Torch |      FA2      |       CUDA (excl. 11.x)        |
+------------------+-------+---------------+--------------------------------+
| v0.0.32.post2    | 2.8.0 | 2.7.1 - 2.8.2 | 12.8.1, 12.9.0                 |
| v0.0.32.post1    | 2.8.0 | 2.7.1 - 2.8.2 | 12.8.1, 12.9.0                 |
| v0.0.32          | 2.7.1 | 2.7.1 - 2.8.2 | 12.8.1, 12.9.0                 | * BUG
| v0.0.31.post1    | 2.7.1 | 2.7.1 - 2.8.0 | 12.8.1                         |
| v0.0.31          | 2.7.1 | 2.7.1 - 2.8.0 | 12.6.3, 12.8.1                 |
| v0.0.30          | 2.7.0 | 2.7.1 - 2.7.4 | 12.6.3, 12.8.1                 |
| v0.0.29.post3    | 2.6.0 | 2.7.1 - 2.7.2 | 12.1.0, 12.4.1, 12.6.3, 12.8.0 |
| v0.0.29.post2    | 2.6.0 | 2.7.1 - 2.7.2 | 12.1.0, 12.4.1, 12.6.3, 12.8.0 |
+------------------+-------+---------------+--------------------------------+
* Torch support: https://github.com/facebookresearch/xformers/blob/main/.github/workflows/wheels.yml
* FA2 support: https://github.com/facebookresearch/xformers/blob/main/xformers/ops/fmha/flash.py
* CUDA support: https://github.com/facebookresearch/xformers/blob/main/.github/actions/setup-build-cuda/action.yml

@BBC-Esq

BBC-Esq commented Nov 13, 2025

> @jordimas PR #1905 was solving the same issue. I also explained all the error details there. Given that you're the new admin, it is suspicious that you ignored that PR (the only green CI for months) and decided to create a duplicate PR. 😡 You did not even leave a comment or mention that PR 👎

I give you credit, man. I've noticed that sometimes people contribute their free time to create a PR only to have someone create another PR that's 95% similar all on their own... and then no recognition.

At the same time, glad that the peeps at CTranslate2 are finally letting someone actually update things for this great repository... and I'm sure @jordimas (who I've communicated with before, good guy) isn't getting paid to do this sort of thing for the repo either so... lol.
